1.   INTRODUCTION

The  logical  structures  of  certain  types  of  problems  may  be 
represented  by  decision  trees.  A  decision  tree  is  a  binary  tree  whose 
internal  nodes  represent  points  in  time  at  which  decisions  must  be  made 
to  take  either  the  left  or  right  branches  of  these  nodes.  The  root 
of  a  decision  tree  represents  the  status  of  a  problem  before  any  decisions 
have been made, and the leaves of the tree represent all possible
outcomes of the problem that could result from all possible
combinations of decisions made at the internal nodes.

There  has  been  a  great  deal  of  study  done  on  related  tree 
problems.  Two  major  areas  of  study  have  been  optimal  search  trees  and 
Huffman's  tree  constructions  for  minimum  redundancy  codes.  Decision 
trees  are,  in  fact,  a  generalization  of  these  other  types  of  trees, 
and  there  are  many  optimality  problems  which  arise  in  the  area  of 
decision  trees  which  cannot  be  solved  using  the  algorithms  derived  for 
handling  these  other,  related  tree  problems. 

The  problem  of  decision  trees  arises  in  the  study  of  decision 
tables  and  in  the  conversion  of  limited-entry  decision  tables  to 
decision  trees  for  the  purpose  of  computer  programming.  Algorithms 
have  been  found  for  converting  a  decision  table  to  a  computer  program 
which  uses  a  minimum  amount  of  storage.   The  storage  requirement  for 
the  program  is  minimized  by  using  the  given  decision  table  to  construct 
a corresponding decision tree which has a minimum number of nodes and
constructing the program from this optimal decision tree. There are
also  algorithms  for  converting  a  decision  table  to  a  computer  program 
which  executes  in  a  minimum  amount  of  time.  The  execution  time  is 
minimized  by  assigning  a  probability,  or  frequency  of  occurrence,  to 
each  possible  outcome  (column)  in  the  table  and  constructing  the 
corresponding  decision  tree  so  that  the  most  likely  outcomes  are  resolved 
at  relatively  low  level  nodes  of  the  tree  and  less  likely  outcomes  are 
resolved  at  the  higher  level  nodes  of  the  tree.  The  decision  tree  for 
minimizing  execution  time  will  probably  contain  more  than  the  minimum 
number  of  nodes  because  frequently  occurring  outcomes  will  be  resolved 
in  as  few  decision  steps  (nodes)  as  possible  even  if  this  necessitates 
additional  decision  steps  for  resolving  outcomes  which  seldom  occur. 
[For  discussion  of  the  two  algorithms  mentioned  here,  see  Pollack, 
"Conversion  of  Limited  Entry  Decision  Tables  to  Computer  Programs." 
Communications  of  the  ACM,  Vol.  8,  No.  11,  November  1965,  pp.  677-682.] 

The  decision  table  application  of  decision  trees  is  not  entirely 
general.  Below  is  an  example  of  a  decision  table: 

Figure 1


A particular case, C_k, is determined by whether each of certain
predicates (conditions) p_1, p_2, ..., p_m holds (entry in table is Y),
does not hold (entry in table is N), or does not apply (entry in
table is -). Each -, or "don't-care" entry, under a particular case
C_k in the table is a substitute for listing two configurations
(assignments of Y or N to each p_i) of the p_1, p_2, ..., p_m in that
case (one with a Y where the dash occurs, the other with an N where the
dash occurs). Thus, cases containing one or more don't-care entries
actually include several configurations of p_1, ..., p_m. However, the
combinations of configurations of p_1, ..., p_m which can be represented
by a single column containing don't-care entries are only a subset of all
possible combinations of configurations which could be included in a
case. There are certain decision tree applications for which it is
necessary to be able to assign any combination of configurations of
the p_1, ..., p_m to each case (each configuration will belong to only one
C_k). Construction of optimal decision trees for such applications cannot
be accomplished by using the previously-mentioned algorithms devised
for converting decision tables to decision trees.

One  such  general  application  of  decision  trees  arises  in  the 
problem  of  trying  to  optimize  the  efficiency  of  branching  in  a  computer 
program.  Consider  the  following  simple  example  which  is  illustrated 
in  Figure  2:  We  are  working  with  the  x  and  y  coordinates  of  points 
in  the  Euclidean  plane.  Assume  x  and  y  are  non-zero.  We  wish  to  branch 
to  different  parts  of  the  program  depending  upon  whether  we  have  case  1 
(the  point  is  in  the  first  quadrant),  case  2  (the  point  is  in  the  second 
quadrant),  or  case  3  (the  point  is  in  the  third  or  fourth  quadrants). 

Figure 2  (the x-y plane, with regions labeled Case 1, Case 2, and Case 3)



The  two  possible  ways  to  program  this  branching  process  are  shown  in 
the  two  decision  trees  below. 


Figure 3a


Figure 3b


Since the number of tests required to resolve any case in Figure 3b is
always less than or equal to the number of tests required to resolve the
same case in Figure 3a, Figure 3b represents the preferable programming
logic.


For this example, there were only two possible decision trees, so it
was convenient to examine both trees and choose the better of the two.
However,  the  relatively  complex  logic  of  most  problems  programmed  for 
the  computer  makes  such  a  trial-and-error  analysis  highly  impractical. 
Therefore,  we  would  like  to  devise  a  systematic  method  for  analyzing 
a  programming  problem  and  creating  a  decision  tree  which  will  optimize 
its  branching  processes.  The  remainder  of  this  paper  will  be  concerned 
with  this  problem. 

A  programming  branching  problem  can  be  represented  by  a  table 
from  which  a  decision  tree  can  be  constructed.  For  example,  the  problem 
represented by Figure 2 can be represented by the following table:

Figure 4


There are two predicates (conditions), p_1 and p_2, each of which may be
either true or false. Therefore, there are 2^2 = 4 possible combinations
for these two predicates. If p_1 and p_2 are both true, we have case C_1;
if p_1 is false and p_2 is true, we have case C_2; and if p_2 is false
(and p_1 either true or false), we have case C_3.


If we make use of the "don't-care" symbol in constructing
our table, then the table in Figure 4 could be represented as the
decision table in Figure 5. (More complicated problems generally
will not be representable in decision table form.) However, for the
present, we shall choose not to use the "don't-care" representation;
therefore, if we have n predicates our table will always contain 2^n
columns. This means that if certain combinations of ones and zeroes
do not concern us in a particular problem, they still must be entered
in the table. However, all these combinations may be grouped into a
single C_k, the "else" case.
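
The convention of always carrying 2^n columns can be sketched in Python. This is a minimal illustration, assuming a table is stored as a mapping from each 0/1 configuration to a case label; the function name `full_table` and the label "else" are choices made here, not notation from the paper.

```python
from itertools import product

def full_table(n, assigned, else_case="else"):
    """Return a table with all 2**n configurations of n predicates.

    `assigned` maps some configurations (tuples of 0/1) to case labels;
    every configuration that is not listed is grouped into `else_case`.
    """
    return {cfg: assigned.get(cfg, else_case)
            for cfg in product((0, 1), repeat=n)}

# Hypothetical problem with n = 2 in which only two configurations matter.
table = full_table(2, {(1, 1): "C1", (0, 1): "C2"})
print(len(table))     # number of columns in the table
print(table[(0, 0)])  # an unlisted configuration falls into the "else" case
```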

Our present study is based on two assumptions: (1) that the
cost of making a decision remains constant over all nodes of the
decision tree, i.e. the truth value of a particular p_i may be
determined with equal ease for all p_i, and (2) that each case occurs
with equal probability. When these assumptions hold true, the best
branching logic for a computer program will be that for which the
corresponding decision tree has a minimal number of nodes. We shall
refer to the tree with the minimal number of nodes as the optimal tree.
There may be more than one optimal tree for a given problem.
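
As a small illustration of the node-count criterion, the following Python sketch counts decision nodes for the quadrant problem. It assumes the table described for Figure 4, with p_1 read as "x > 0" and p_2 as "y > 0"; that reading and the function names are assumptions made for this example.

```python
def min_nodes(table, columns=None, unused=None):
    """Fewest decision nodes needed to separate the cases in `columns`."""
    n = len(next(iter(table)))
    if columns is None:
        columns = list(table)
    if unused is None:
        unused = tuple(range(n))
    # A subtable holding at most one case is a leaf and needs no test.
    if len({table[c] for c in columns}) <= 1:
        return 0
    best = None
    for p in unused:
        rest = tuple(q for q in unused if q != p)
        cost = 1
        for value in (0, 1):
            branch = [c for c in columns if c[p] == value]
            cost += min_nodes(table, branch, rest)
        best = cost if best is None else min(best, cost)
    return best

def nodes_with_root(table, root):
    """Node count of the best tree whose root tests predicate `root`."""
    n = len(next(iter(table)))
    rest = tuple(q for q in range(n) if q != root)
    return 1 + sum(
        min_nodes(table, [c for c in table if c[root] == v], rest)
        for v in (0, 1)
    )

# Configurations are (p_1, p_2); index 0 is p_1 and index 1 is p_2.
QUADRANTS = {(1, 1): "C1", (0, 1): "C2", (1, 0): "C3", (0, 0): "C3"}
print(nodes_with_root(QUADRANTS, 0))  # tree rooted at p_1
print(nodes_with_root(QUADRANTS, 1))  # tree rooted at p_2
```

Under these assumptions the tree rooted at p_2 needs fewer decision nodes than the tree rooted at p_1, in line with the preference for Figure 3b.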


The  search  for  a  systematic  method  for  finding  the  optimal 
tree  has  not  yet  yielded  an  algorithm  which  is  guaranteed  to  produce 
the  optimal  tree  on  the  first  try.  However,  I  have  found  several 
heuristic  algorithms  which,  when  applied,  greatly  reduce  the  effort 
that  is  required  to  produce  the  optimal  tree  by  trial-and-error  methods. 
These heuristic algorithms will be discussed in Chapters 2 and 3.

Let  us  first  consider  the  obvious  algorithm  of  finding  an 
optimal  decision  tree  by  exhaustive  search.   Consider  the  problem 
illustrated  by  the  following  table: 

Figure 6


Here n = 4, so we have 2^4 = 16 columns in the table. From this table, we
shall construct a tree which will have the following structure:


Figure 7  (tree skeleton: each internal node is to be filled with some
p_i; the leaves are resolved cases)


The maximum number of nodes the tree can contain is 15 (= 2^n - 1, where n
is the number of predicates), but it may contain fewer than 15 nodes if
parts of some cases (some columns from the table) can be resolved without
testing all of the p_i. If we conduct an exhaustive trial-and-error search
for the optimal tree, how many trees must we check? Realizing that each
root-to-leaf path can contain a particular p_i at most once, we see that
we have 4 choices for the level zero node, 3^2 choices for the level one
nodes, 2^4 choices for the level two nodes, and 1^8 choices for the level
three nodes. Thus, the maximum number of trees that would have to be
inspected is 4 · 3^2 · 2^4 · 1^8 = 576 trees. Of course, if some of the trees
have fewer than 15 nodes, there will be fewer trees to investigate, but
the number will remain quite large; too large, in fact, to make
trial-and-error investigation feasible even when n is as small as 4. When
the exhaustive trial-and-error search was programmed and run on the
computer, it was found that for this example, 484 trees had to be
inspected to discover that there exist two optimal trees with six nodes
each. In general, when there are n predicates, the upper bound for the
number of trees which must be inspected to ensure obtaining the optimal
tree is given by:

Maximum number of trees to inspect =

$$ n^{2^0} \cdot (n-1)^{2^1} \cdot (n-2)^{2^2} \cdots (n-k)^{2^k} \cdots (1)^{2^{n-1}} $$

Obviously,  an  exhaustive  trial-and-error  search  for  the  optimal  decision 
tree  requires  too  much  work  to  be  feasible. 
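
The growth of this bound is easy to check numerically. The short Python sketch below evaluates the product for a given n; the function name `max_trees` is chosen here for illustration.

```python
def max_trees(n):
    """Upper bound on the trees an exhaustive search inspects.

    Level k of a full tree has 2**k nodes, and each of them can be
    filled with any of the n - k predicates not yet used on its path.
    """
    total = 1
    for k in range(n):
        total *= (n - k) ** (2 ** k)
    return total

for n in range(1, 6):
    print(n, max_trees(n))
```

For n = 4 the function reproduces the 576 trees counted above, and the bound for n = 5 is already in the millions.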



2.   HEURISTIC  ALGORITHMS 


2.1.   Algorithm  1  (Constant  Cases) 

Step 1: To determine which p_i to select when constructing
the decision tree, look at the table and choose the p_i for which the
most cases are constant (either all zeroes or all ones), if such a p_i
exists. For example, in Figure 4, for p_1, cases C_1 and C_2 are constant,
but for p_2, all three cases are constant. Therefore, p_2 should be
chosen first. Indeed, as shown by Figures 3a and 3b, p_2 is the
preferable choice.

Step 2: If no such p_i exists, proceed as though by the
exhaustive search method and apply this "constant case" algorithm to
resulting subtables whenever possible. Whenever this algorithm yields
a "best" choice for a particular node, no other p_i's need to be tried
for that node (unless, of course, in proceeding through the exhaustive
search we change a p_i which lies on the path between the root and our
particular node). Therefore, we eliminate looking at some of the trees
we might otherwise have considered when searching exhaustively for the
optimal tree. Note that when two or more p_i's have the same number of
constant cases, each of these p_i's should be tried; they might not all
prove to be equally good choices. To illustrate Step 2, refer to Figure 6.
There are no constant cases for any of the p_i's. Therefore, proceed
as by the exhaustive search method. Suppose p_1 is chosen to be the root
of the tree. Construct two subtables:

To obtain the left descendant of the root (p_1 = 0), note that both
p_2 and p_3 have two constant cases, so both of these predicates must be
tried, but p_4 need not be tried. To obtain the right descendant of the
root (p_1 = 1), note that p_2 has three constant cases, more than either
p_3 or p_4, so choose p_2; p_3 and p_4 never need to be considered. The
tree now looks like this:


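Figure 9  (partial tree with p_1 at the root; the left descendant is
marked to show that a second predicate must also be tried there)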


Now construct two more subtables for each bottom node and continue
constructing the tree until all the leaves of the tree are resolved
cases. After constructing all possible trees with p_1 at the root
(eliminating the consideration of some, of course, by using the
algorithm), go back and do the same for roots of p_2, p_3, and p_4.
(Since the algorithm did not, in this example, yield any information
about which p_i would be a best choice for the root of the decision
tree, all roots must be tried.) The optimal decision tree is the best
tree found by the above-described search. When Algorithm 1 was
programmed and run on the computer, it was found that for the above
example, 36 trees were inspected to find the two optimal trees with
six nodes each. This is in contrast to the 484 trees examined in the
exhaustive search.
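
Step 1 of Algorithm 1 can be sketched in Python on the two-predicate quadrant table described for Figure 4. The table layout (a mapping from 0/1 configurations to case labels) and the function name `constant_case_counts` are assumptions made for this illustration.

```python
def constant_case_counts(table):
    """For each predicate, count the cases in which it is constant.

    A case is constant for a predicate when every column of that case
    holds the same value (all zeroes or all ones) in the predicate's row.
    """
    n = len(next(iter(table)))
    cases = sorted(set(table.values()))
    counts = []
    for p in range(n):
        constant = 0
        for case in cases:
            values = {cfg[p] for cfg, c in table.items() if c == case}
            if len(values) == 1:
                constant += 1
        counts.append(constant)
    return counts

# Configurations are (p_1, p_2).
QUADRANTS = {(1, 1): "C1", (0, 1): "C2", (1, 0): "C3", (0, 0): "C3"}
counts = constant_case_counts(QUADRANTS)
print(counts)                         # constant cases for p_1 and p_2
best = max(counts)
print([i + 1 for i, c in enumerate(counts) if c == best])  # candidates
```

When several predicates tie for the largest count, all of them are returned as candidates, matching the note in Step 2 that each must be tried.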

2.2.   Algorithm  2  (Weight  Function) 

Step 1: For each predicate, p_i, calculate a weight given by

$$ WT_i = \sum_{\text{all cases}} \sum_{k=1}^{n-1} k \cdot P_k $$

where (a) P_k is the number of groups of 2^k zeroes or ones there are
within a case, (b) no zeroes or ones are counted more than once, and
(c) longer strings are counted over shorter strings (i.e. four zeroes
would be counted as one group of 2^2 zeroes, not two groups of 2^1
zeroes). For example, if one case within a p_i contains six zeroes and
three ones, the weight for that case would be given by:

$$ wt = 2 \cdot 1 + 1 \cdot 2 $$

Here the term 2 · 1 counts one group of 2^2 zeroes; the term 1 · 2 counts
one group of 2^1 zeroes plus one group of 2^1 ones (two groups of 2^1
zeroes or ones); and one 1 is left over.

To find WT_i for p_i, sum the weights over all cases.


Step 2: Choose the predicate for which the weight is a
maximum.   If  more  than  one  predicate  has  the  maximum  weight,  all 
those  with  the  maximum  weight  must  be  tried  to  ensure  that  the  next 
node  of  the  decision  tree  will  be  filled  with  the  "best"  predicate. 

We  shall  evaluate  the  weights  for  all  the  predicates  in 
the example in Figure 6.

p_1:  WT_1 = (2·1 + 1·1) + (1·1) + (2·1 + 1·1) = 7

p_2:  WT_2 = (2·1 + 1·1) + (1·1) + (2·1 + 1·1) = 7

p_3:  WT_3 = (2·1 + 1·1) + (1·1) + (2·1 + 1·1) = 7

p_4:  WT_4 = (2·1 + 1·1) + (1·1) + (1·2)       = 6

The  algorithm  happens  not  to  be  decisive  for  choosing  the  root  of  the 
decision tree, though it does show that p_4 would be the worst choice
for the root (so p_4 need not be considered as a possible root). When
Algorithm  2  was  run  on  the  computer,  22  trees  were  inspected  to  find 
the  two  optimal  trees  with  six  nodes  each.  This  is  somewhat  better 
than  the  36  trees  inspected  when  using  Algorithm  1. 
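
The per-case weight of Algorithm 2 can be sketched in Python. This sketch assumes that only the numbers of zeroes and ones within a case matter, as in the six-zeroes, three-ones example, so each count is split greedily into power-of-two groups (its binary expansion); the function name `case_weight` is chosen here for illustration.

```python
def case_weight(zeroes, ones):
    """Weight contributed by one case to a predicate's total weight.

    Each count is split into groups of 2**k, largest first, so that no
    zero or one is counted twice. A group of 2**k adds k to the weight;
    a single leftover entry (k = 0) adds nothing.
    """
    weight = 0
    for count in (zeroes, ones):
        for k in range(1, count.bit_length()):
            if (count >> k) & 1:  # one group of 2**k in this count
                weight += k
    return weight

print(case_weight(6, 3))  # the six-zeroes, three-ones example
print(case_weight(4, 0))  # four zeroes form a single group of 2**2
```

The total weight for a predicate is then the sum of `case_weight` over all cases in its row.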

2.3.   Algorithm  3  (Weight  Function) 


Step 1: For each predicate, p_i, calculate a weight given by

$$ WT_i = \sum_{\text{all cases}} \sum_{k=1}^{n-1} k^2 \cdot P_k $$

where the notation is the same as that described for Algorithm 2.


Step 2: Same as Step 2 for Algorithm 2.

Again, we shall evaluate the weights for the predicates
in Figure 6.

p_1:  (2^2·1 + 1^2·1) + (1^2·1) + (2^2·1 + 1^2·1) = 11

p_2:  (2^2·1 + 1^2·1) + (1^2·1) + (2^2·1 + 1^2·1) = 11

p_3:  (2^2·1 + 1^2·1) + (1^2·1) + (2^2·1 + 1^2·1) = 11

p_4:  (2^2·1 + 1^2·1) + (1^2·1) + (1^2·2)         = 8

Algorithm  3  has  the  effect  of  weighing  more  heavily  the  larger  groups 
of  ones  and  zeroes  than  does  Algorithm  2.  As  with  the  earlier  algorithms, 
Algorithm  3  is  indecisive  for  selecting  a  root  in  the  example  shown 
here.  However,  when  this  algorithm  was  run  on  a  computer,  it  was  found 
that  only  eight  trees  needed  to  be  investigated  to  find  the  two  optimal 
trees  with  six  nodes  each.  This  is  a  considerable  improvement  over  the 
22  trees  which  had  to  be  investigated  when  Algorithm  2  was  used. 
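
The two weight functions differ only in the exponent applied to k, which the Python sketch below makes explicit. The group counts for p_1 and p_4 are read off the worked sums above (for each case, a mapping from k to P_k); the names `weight`, `P1_GROUPS`, and `P4_GROUPS` are introduced here for illustration.

```python
def weight(cases, power):
    """Sum k**power * P_k over all cases of one predicate.

    `cases` holds one dict per case, mapping k to the number P_k of
    groups of 2**k zeroes or ones. power=1 corresponds to Algorithm 2
    and power=2 to Algorithm 3.
    """
    return sum(k ** power * p_k
               for case in cases
               for k, p_k in case.items())

# Group counts taken from the worked sums for Figure 6.
P1_GROUPS = [{2: 1, 1: 1}, {1: 1}, {2: 1, 1: 1}]
P4_GROUPS = [{2: 1, 1: 1}, {1: 1}, {1: 2}]

for name, groups in (("p_1", P1_GROUPS), ("p_4", P4_GROUPS)):
    print(name, weight(groups, 1), weight(groups, 2))
```

Squaring k widens the gap between p_1 and p_4, which is the sense in which Algorithm 3 weighs larger groups more heavily.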

The  three  algorithms  plus  the  exhaustive  search  were  programmed 
for  the  computer  and  run  for  eight  different  problem  situations,  each 
with n = 4 (hand testing is fairly easy for n ≤ 3). The eight trials were not
random  problem  situations,  but  rather,  were  carefully  selected  to 
represent  as  wide  a  range  of  different  types  of  situations  as  possible. 
The  number  of  cases  varied  from  two  to  five.  The  results  are 
summarized  below: 

As the table shows, the algorithms did not always yield all the
optimal trees. However, that is not a problem because any optimal
tree is considered to be as good as any other. Algorithm 2 nearly
always required examination of fewer trees than Algorithm 1, and
Algorithm 3 was always at least as good as, and often better than,
Algorithm 2 in this respect. The starred (*) entries in the table
indicate instances in which an algorithm failed to find an optimal
tree. If one is looking for accuracy in finding the optimal tree,
Algorithm 2 appears best. However, if one is more interested in speed
and cares only about finding a "good" (and not necessarily a "best")
tree, Algorithm 3 is preferable. In any case, all the algorithms
reduced substantially the exhaustive search effort, and all found
good, though not always optimal, trees. The algorithms remain
untested for n > 4.

3.   AN IMPROVED HEURISTIC ALGORITHM


After  extensive  experimentation  with  Algorithms  1,  2,  and 
3,  I  became  convinced  that  the  single  most  important  consideration 
when  selecting  a  "best"  predicate  is  the  number  of  occurrences  of  zeroes 
or ones in groups of powers of two within cases. This suggests
considering decision tables which contain "don't-care" (-) entries since
the dashes are helpful in indicating where powers of two occur. For
example, the case C_k in Figure 11a could be represented as in Figure 11b
with the dash (-) signifying that the predicate in whose row it appears
is an undesirable choice as far as case C_k is concerned.

Figure 11a


Figure 11b


The  Quine-McCluskey  minimization  procedure  is  used  to  construct  the  table 
with "don't-care" entries. For n ≤ 4, a Karnaugh map is also useful.
Let us use the Quine-McCluskey procedure to convert the table in
Figure 6 to a table containing "don't-care" entries. Each case must
be handled separately, so we begin with C_1. First, we find the prime
implicants as shown below:

The following comments are made in regard to Figure 12:

(1) The columns in Case C_1 in Figure 6 are represented
horizontally as complete sequences of p_i's.

(2)  To  facilitate  finding  derived  sequences,  the  complete 
sequences  (and  thus,  the  derived  sequences  also)  are  listed  in  order 
of  the  number  of  l's  they  contain. 

(3)  In  order  for  two  sequences  to  be  combined,  they  must  be 
identical  in  all  but  the  one  position  in  which  one  sequence  will  contain 
a  1,  the  other  a  0. 

(4) When two sequences are combined, they are checked (✓)
in  the  table.  The  prime  implicants  are  all  those  sequences  which  are 
unchecked  after  no  more  sequences  can  be  combined  to  form  more 
derived  sequences. 

(5)  Every  sequence  must  be  compared  with  all  sequences  below 
it  which  contain  the  same  number  of  l's  or  one  more  1  than  the  given 
sequence  (even  if  some  of  these  sequences  are  already  checked),  and  all 
possible  derived  sequences  must  be  written. 
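
The combining and checking rules in comments (3) through (5) can be sketched in Python. This is a minimal illustration of the prime-implicant step only, assuming sequences are written as strings over "0", "1", and "-"; the function names and the three input sequences are hypothetical and are not taken from Figure 12.

```python
from itertools import combinations

def combine(a, b):
    """Merge two sequences that are identical except in one position
    where one holds a 1 and the other a 0; otherwise return None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and {a[diff[0]], b[diff[0]]} == {"0", "1"}:
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def prime_implicants(sequences):
    """Return the sequences left unchecked once nothing more combines."""
    current = set(sequences)
    primes = set()
    while current:
        checked, derived = set(), set()
        # Compare every pair, including sequences already checked.
        for a, b in combinations(sorted(current), 2):
            merged = combine(a, b)
            if merged is not None:
                checked.update((a, b))
                derived.add(merged)
        primes |= current - checked
        current = derived
    return primes

print(sorted(prime_implicants(["0111", "1110", "1111"])))
```

This sketch compares all pairs rather than only sequences whose counts of 1's differ by at most one; the ordering described in comment (2) is a bookkeeping aid that reduces the comparisons without changing the result.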

After  the  prime  implicants  have  been  obtained,  a  McCluskey 
Chart  is  constructed.  For  our  example,  the  following  chart  is  obtained. 

In the new table containing "don't-care" entries, C_1 is
represented as follows:

The table entries for C_2 and C_3 can also be derived using
the Quine-McCluskey procedure to yield the following table:

Figure 15

Figure 15 could also have been all or partly derived from the following
Karnaugh map which can be obtained directly from Figure 6. Of course,
use of a Karnaugh map is only practical for n ≤ 4.


Figure 16  (Karnaugh map for Figure 6. Entries in the map indicate to
which case the corresponding sequence of 1's and 0's belongs; essential
and non-essential prime implicants are marked with different outlines.)

The following comments are in regard to Figure 15:

(1) Essential prime implicants are represented without
brackets ([]) with one important exception: when two essential prime
implicants overlap in all but one column of the McCluskey Chart or in
all but one box of a Karnaugh map (that one column or box being the one
which makes each of the two essential prime implicants essential), then
when entering the two essential prime implicants in the table, put
brackets around each and insert the word "or" between the entries.
For example, suppose the following map and chart illustrate a case
C_3 (n = 4):

Figure 17


Both -111 and 111- are essential prime implicants and they overlap in
all but one box in the Karnaugh map and all but one column of the
McCluskey Chart (-111 is essential because it contains "7", 111- because
it contains "14"). Therefore, the case C_3 would be represented:

(2) When there is a choice of which non-essential prime
implicant  (or  combination  of  non-essential  prime  implicants)  to  select, 
list  all  choices  in  the  table,  each  in  brackets,  and  join  the  brackets 
by  the  word  "or".  Call  such  a  group  of  bracketed  entries  joined  by  the 
word  "or"  an  or-group  (each  bracketed  entry  will  be  called  a  component 
of  the  or-group),  and  define  the  or-number  of  an  or-group  to  be  the 
number  of  components  contained  in  the  or-group.  There  may  be  more  than 
one  or-group  in  a  case.  For  example,  a  case  could  have  the  following 
structure: [] or []   [] or [] or []. The first or-group has an or-number
of 2, the second an or-number of 3.

(3) Define the case-count for a case to be 2^r, where

$$ r = (\text{number of dashes occurring in non-bracketed entries in the case}) + \sum_{\text{all or-groups in the case}} (\text{maximum number of dashes in any component of the or-group}) $$

(4) Define the dash-count for a particular p_i to be the sum
of the case-counts corresponding to each non-bracketed dash in the row
plus the sum of (1 / or-number) · (corresponding case-count) for each
bracketed dash in the row.
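
The case-count of definition (3) can be sketched in Python. The sketch assumes a case is given as a list of non-bracketed entries plus a list of or-groups, each or-group being a list of bracketed components, with every entry written as a string over "0", "1", and "-". This representation, the function name `case_count`, and the sample entries are assumptions made for illustration.

```python
def case_count(plain_entries, or_groups=()):
    """Return 2**r for one case of the "don't-care" table.

    r adds the dashes in all non-bracketed entries to, for each
    or-group, the largest number of dashes in any of its components.
    """
    r = sum(entry.count("-") for entry in plain_entries)
    r += sum(max(component.count("-") for component in group)
             for group in or_groups)
    return 2 ** r

# One non-bracketed entry with two dashes.
print(case_count(["1-0-"]))
# A single or-group of or-number 2, such as [-111] or [111-].
print(case_count([], [["-111", "111-"]]))
# A hypothetical case with one plain entry and two or-groups.
print(case_count(["00--"], [["-111", "111-"], ["1-0-", "10--", "0--1"]]))
```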

The preceding comments and definitions lead to the statement
of Algorithm 4 for constructing an optimal decision tree. The notions
of case-count and dash-count resemble those described by Pollack in
his article, "Conversion of Limited-Entry Decision Tables to Computer
Programs" (cited earlier in this paper).

3.1.   Algorithm 4 (Dash-Count)

Step 1: Using a Karnaugh map or the Quine-McCluskey
procedure plus the bracketing rules described above, construct a
table containing "don't-care" entries.

Step 2: Compute the case-count for each case.

Step 3: Compute the dash-count for each p_i.

Step 4: Select the p_i for which the dash-count is a minimum.
If more than one p_i has minimum dash-count, select any of the p_i's
with a minimum dash-count.

The  case-counts  and  dash-counts  for  Figure  15  are  shown  below: 

Since p_2 has the minimum dash-count, choose p_2 to be the root (p_2 is
indeed the root of both optimal trees). Use the original table (not the
one with the "don't-care" entries) to create two subtables and apply
Algorithm 4 to each of the subtables.

One very significant advantage of Algorithm 4 over the other
algorithms is that cases which are of no concern in a particular
problem can be designated as such (as a "d" in a Karnaugh map or
Quine-McCluskey chart) rather than having to be combined into a single
"else" case. For example, suppose we have a problem in which n = 4
and in which we assign only 13 predicate sequences to cases C_1 through
C_4. We have no concern for what happens to the three sequences 0010,
0011, and 0111. In order to apply our earlier algorithms, we would
have to first combine these three "else" sequences into a single
case, C_5. However, for Algorithm 4, this "else" case is not necessary.
Suppose the original table is:

The following Karnaugh map can be constructed from this table:


Figure 21


The  "d"  entries  provide  a  far  more  accurate  representation 
of  the  problem  than  would  a  fifth,  "else"  case;  and  as  the  Karnaugh 
map  indicates,  a  far  better  optimal  tree  will  result  when  the  "d" 
entries  are  used.  Figure  22  shows  an  optimal  decision  tree  obtained 
using  an  "else"  case  and  one  obtained  using  "d"  entries. 

Optimal Decision Tree Using C_5
as "Else" Case (8 nodes)


Optimal Decision Tree
Using "d" Entries (5 nodes)


Figure 22
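
The effect of "d" entries on the node count can be reproduced on a much smaller, hypothetical problem. The Python sketch below is not Algorithm 4; it is a plain exhaustive node count, assuming that a sequence of no concern is simply left out of the table so that it never forces a test. The two-predicate table and all names are invented for illustration.

```python
def min_nodes(table, n, columns=None, unused=None):
    """Fewest decision nodes separating the cases listed in `table`.

    Configurations absent from `table` are treated as "d" entries and
    are free to fall under any leaf.
    """
    if columns is None:
        columns = list(table)
    if unused is None:
        unused = tuple(range(n))
    if len({table[c] for c in columns}) <= 1:
        return 0
    costs = []
    for p in unused:
        rest = tuple(q for q in unused if q != p)
        cost = 1
        for value in (0, 1):
            branch = [c for c in columns if c[p] == value]
            cost += min_nodes(table, n, branch, rest)
        costs.append(cost)
    return min(costs)

# Hypothetical n = 2 problem: the sequence (0, 0) is of no concern.
with_d = {(1, 1): "C1", (1, 0): "C1", (0, 1): "C2"}
with_else = dict(with_d)
with_else[(0, 0)] = "else"

print(min_nodes(with_else, 2))  # unused sequence forced into an "else" case
print(min_nodes(with_d, 2))     # unused sequence left as a "d" entry
```

Treating the unused sequence as a "d" entry removes one test in this small example, the same kind of saving that Figure 22 shows for the four-predicate problem.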


Algorithm 4 has yielded an optimal decision tree in many
examples on which it was tested. However, Toshio Yasui (Ph.D. student,
University of Illinois) has found the following counterexample to
Algorithm 4. Given the decision table in Figure 23,

Figure 23

Algorithm 4 indicates that the optimal tree would have p_1 for its
root. In fact, however, all the optimal trees have either p_2, p_3,
or p_4 for their roots, with a root of p_1 yielding no optimal trees.
Therefore, although Algorithm 4 provides an efficient, systematic
method for finding a very "good" decision tree, it will, in some
situations, fail to yield the optimal decision tree. Algorithm 4 is
judged, however, to be generally more reliable than the algorithms
presented earlier in this paper.